225 research outputs found

    Generalised Fourier analysis of human chromosome images

    Measuring Societal Biases in Text Corpora via First-Order Co-occurrence

    Text corpora are used to study societal biases, typically through statistical models such as word embeddings. The bias of a word towards a concept is typically estimated using vector similarity, which measures whether the word and the concept words share other words in their contexts. We argue that this second-order relationship introduces unrelated concepts into the measure, which makes the measurement of the bias imprecise. We propose instead to measure bias using the direct normalized co-occurrence associations between the word and the representative concept words, a first-order measure, by reconstructing the co-occurrence estimates inherent in the word embedding models. To study our novel corpus bias measurement method, we calculate the correlation of the gender bias values estimated from the text to the actual gender bias statistics of the U.S. job market, provided by two recent collections. The results show a consistently higher correlation when using the proposed first-order measure with a variety of word embedding models, as well as a more severe degree of bias, especially towards females in a few specific occupations.
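
A minimal sketch of the first-order idea described in this abstract, under loudly stated assumptions: it counts co-occurrences directly from a toy corpus instead of reconstructing them from a word embedding model as the abstract describes, and it uses positive pointwise mutual information (PPMI) as a stand-in normalization. The function names, the toy sentences, and the concept word lists are all hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def build_stats(sentences, window=3):
    """Count ordered co-occurrence pairs within a symmetric window."""
    pair = Counter()
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    pair[(word, tokens[j])] += 1
    # Marginal count of each word, taken over all of its co-occurrence pairs.
    marginal = Counter()
    for (word, _), n in pair.items():
        marginal[word] += n
    total = sum(pair.values())
    return pair, marginal, total


def ppmi(word, context, stats):
    """Positive PMI of a word pair; 0.0 when the pair never co-occurs."""
    pair, marginal, total = stats
    joint = pair.get((word, context), 0)
    if joint == 0:
        return 0.0
    return max(0.0, math.log(joint * total / (marginal[word] * marginal[context])))


def first_order_bias(word, female_words, male_words, stats):
    """Mean association with female concept words minus mean with male ones.

    Positive values lean female, negative values lean male (assumed convention).
    """
    female = sum(ppmi(word, c, stats) for c in female_words) / len(female_words)
    male = sum(ppmi(word, c, stats) for c in male_words) / len(male_words)
    return female - male


# Hypothetical toy corpus, for illustration only.
sentences = [
    "she is a nurse".split(),
    "the nurse said she was tired".split(),
    "he is an engineer".split(),
    "the engineer said he was late".split(),
]
stats = build_stats(sentences, window=3)
nurse_bias = first_order_bias("nurse", ["she"], ["he"], stats)
engineer_bias = first_order_bias("engineer", ["she"], ["he"], stats)
print(nurse_bias, engineer_bias)
```

In this toy corpus "nurse" only co-occurs with "she" and "engineer" only with "he", so the two scores have opposite signs. The score depends only on direct co-occurrence with the concept words, not on shared context words, which is the distinction from the second-order vector similarity measure that the abstract draws.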

    Morphological Segmentation on Learned Boundaries

    Colour information is usually not enough to segment natural complex scenes. Texture contains relevant information that segmentation approaches should consider. Martin et al. [Learning to detect natural image boundaries using local brightness, color, and texture cues, IEEE Transactions on Pattern Analysis and Machine Intelligence 26 (5) (2004) 530-549] proposed a particularly interesting colour-texture gradient. This gradient is not suitable for Watershed-based approaches because it contains gaps. In this paper, we propose a method based on the distance function to fill these gaps. Then, two hierarchical Watershed-based approaches, the Watershed using volume extinction values and the Waterfall, are used to segment natural complex scenes. Resulting segmentations are thoroughly evaluated and compared to segmentations produced by the Normalised Cuts algorithm using the Berkeley segmentation dataset and benchmark. Evaluations based on both the area overlap and boundary agreement with manual segmentations are performed.